AITopics | translation equivariance

Graph neural networks (GNNs) are commonly described as being permutation equivariant with respect to node relabeling in the graph. This symmetry of GNNs is often compared to the translation equivariance of Euclidean convolution neural networks (CNNs). However, these two symmetries are fundamentally different: The translation equivariance of CNNs corresponds to symmetries of the fixed domain acting on the image signals (sometimes known as active symmetries), whereas in GNNs any permutation acts on both the graph signals and the graph domain (sometimes described as passive symmetries). In this work, we focus on the active symmetries of GNNs, by considering a learning setting where signals are supported on a fixed graph. In this case, the natural symmetries of GNNs are the automorphisms of the graph. Since real-world graphs tend to be asymmetric, we relax the notion of symmetries by formalizing approximate symmetries via graph coarsening. We present a bias-variance formula that quantifies the tradeoff between the loss in expressivity and the gain in the regularity of the learned estimator, depending on the chosen symmetry group. To illustrate our approach, we conduct extensive experiments on image inpainting, traffic flow prediction, and human pose estimation with different choices of symmetries. We show theoretically and empirically that the best generalization performance can be achieved by choosing a suitably larger group than the graph automorphism, but smaller than the permutation group.

equivariant graph network, name change, symmetry, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.82)

Add feedback

Approximately Equivariant Graph Networks

Neural Information Processing SystemsJan-19-2025, 03:27:12 GMT

Graph neural networks (GNNs) are commonly described as being permutation equivariant with respect to node relabeling in the graph. This symmetry of GNNs is often compared to the translation equivariance of Euclidean convolution neural networks (CNNs). However, these two symmetries are fundamentally different: The translation equivariance of CNNs corresponds to symmetries of the fixed domain acting on the image signals (sometimes known as active symmetries), whereas in GNNs any permutation acts on both the graph signals and the graph domain (sometimes described as passive symmetries). In this work, we focus on the active symmetries of GNNs, by considering a learning setting where signals are supported on a fixed graph. In this case, the natural symmetries of GNNs are the automorphisms of the graph.

equivariant graph network, symmetry, translation equivariance, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Add feedback

Convolutional Conditional Neural Processes

Bruinsma, Wessel P.

arXiv.org Machine LearningAug-18-2024

Neural processes are a family of models which use neural networks to directly parametrise a map from data sets to predictions. Directly parametrising this map enables the use of expressive neural networks in small-data problems where neural networks would traditionally overfit. Neural processes can produce well-calibrated uncertainties, effectively deal with missing data, and are simple to train. These properties make this family of models appealing for a breadth of applications areas, such as healthcare or environmental sciences. This thesis advances neural processes in three ways. First, we propose convolutional neural processes (ConvNPs). ConvNPs improve data efficiency of neural processes by building in a symmetry called translation equivariance. ConvNPs rely on convolutional neural networks rather than multi-layer perceptrons. Second, we propose Gaussian neural processes (GNPs). GNPs directly parametrise dependencies in the predictions of a neural process. Current approaches to modelling dependencies in the predictions depend on a latent variable, which consequently requires approximate inference, undermining the simplicity of the approach. Third, we propose autoregressive conditional neural processes (AR CNPs). AR CNPs train a neural process without any modifications to the model or training procedure and, at test time, roll out the model in an autoregressive fashion. AR CNPs equip the neural process framework with a new knob where modelling complexity and computational expense at training time can be traded for computational expense at test time. In addition to methodological advancements, this thesis also proposes a software abstraction that enables a compositional approach to implementing neural processes. This approach allows the user to rapidly explore the space of neural process models by putting together elementary building blocks in different ways.

average kullback-leibler divergence, non-convolutional neural process, original conditional neural process, (14 more...)

arXiv.org Machine Learning

doi: 10.17863/CAM.100216

2408.09583

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.13)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Translation Equivariant Transformer Neural Processes

Ashman, Matthew, Diaconu, Cristiana, Kim, Junhyuck, Sivaraya, Lakee, Markou, Stratis, Requeima, James, Bruinsma, Wessel P., Turner, Richard E.

arXiv.org Machine LearningJun-18-2024

The effectiveness of neural processes (NPs) in modelling posterior prediction maps -- the mapping from data to posterior predictive distributions -- has significantly improved since their inception. This improvement can be attributed to two principal factors: (1) advancements in the architecture of permutation invariant set functions, which are intrinsic to all NPs; and (2) leveraging symmetries present in the true posterior predictive map, which are problem dependent. Transformers are a notable development in permutation invariant set functions, and their utility within NPs has been demonstrated through the family of models we refer to as TNPs. Despite significant interest in TNPs, little attention has been given to incorporating symmetries. Notably, the posterior prediction maps for data that are stationary -- a common assumption in spatio-temporal modelling -- exhibit translation equivariance. In this paper, we introduce of a new family of translation equivariant TNPs that incorporate translation equivariance. Through an extensive range of experiments on synthetic and real-world spatio-temporal data, we demonstrate the effectiveness of TE-TNPs relative to their non-translation-equivariant counterparts and other NP baselines.

dataset, opération, translation equivariance, (6 more...)

arXiv.org Machine Learning

2406.12409

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > Canada > Ontario > Toronto (0.14)
(5 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Convolutional layers are equivariant to discrete shifts but not continuous translations

McGreivy, Nick, Hakim, Ammar

arXiv.org Artificial IntelligenceDec-6-2023

The purpose of this short and simple note is to clarify a common misconception about convolutional neural networks (CNNs). CNNs are made up of convolutional layers which are shift equivariant due to weight sharing. However, convolutional layers are not translation equivariant, even when boundary effects are ignored and when pooling and subsampling are absent. This is because shift equivariance is a discrete symmetry while translation equivariance is a continuous symmetry. This fact is well known among researchers in equivariant machine learning, but is usually overlooked among non-experts. To minimize confusion, we suggest using the term `shift equivariance' to refer to discrete shifts in pixels and `translation equivariance' to refer to continuous translations.

convolutional layer, equivariance, equivariant, (11 more...)

arXiv.org Artificial Intelligence

2206.04979

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.05)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.05)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

What Affects Learned Equivariance in Deep Image Recognition Models?

Bruintjes, Robert-Jan, Motyka, Tomasz, van Gemert, Jan

arXiv.org Artificial IntelligenceApr-7-2023

Equivariance w.r.t. geometric transformations in neural networks improves data efficiency, parameter efficiency and robustness to out-of-domain perspective shifts. When equivariance is not designed into a neural network, the network can still learn equivariant functions from the data. We quantify this learned equivariance, by proposing an improved measure for equivariance. We find evidence for a correlation between learned translation equivariance and validation accuracy on ImageNet. We therefore investigate what can increase the learned equivariance in neural networks, and find that data augmentation, reduced model capacity and inductive bias in the form of convolutions induce higher learned equivariance in neural networks.

equivariance, machine learning, pattern recognition, (15 more...)

arXiv.org Artificial Intelligence

2304.02628

Country:

Europe > Netherlands > South Holland > Delft (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
Africa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.41)

Add feedback

Exploration into Translation-Equivariant Image Quantization

Shin, Woncheol, Lee, Gyubok, Lee, Jiyoung, Lyou, Eunyi, Lee, Joonseok, Choi, Edward

arXiv.org Artificial IntelligenceFeb-26-2023

This is an exploratory study that discovers the current image quantization (vector quantization) do not satisfy translation equivariance in the quantized space due to aliasing. Instead of focusing on anti-aliasing, we propose a simple yet effective way to achieve translation-equivariant image quantization by enforcing orthogonality among the codebook embeddings. To explore the advantages of translation-equivariant image quantization, we conduct three proof-of-concept experiments with a carefully controlled dataset: (1) text-to-image generation, where the quantized image indices are the target to predict, (2) image-to-text generation, where the quantized image indices are given as a condition, (3) using a smaller training set to analyze sample efficiency. From the strictly controlled experiments, we empirically verify that the translation-equivariant image quantizer improves not only sample efficiency but also the accuracy over VQGAN up to +11.9% in text-to-image generation and +3.9% in image-to-text generation.

equivariance, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2112.00384

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > India (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > Experimental Study (0.54)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Alias-Free Generative Adversarial Networks

Karras, Tero, Aittala, Miika, Laine, Samuli, Härkönen, Erik, Hellsten, Janne, Lehtinen, Jaakko, Aila, Timo

arXiv.org Artificial IntelligenceJul-15-2021

We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.

equivariance, generator, proc, (16 more...)

arXiv.org Artificial Intelligence

2106.12423

Country: